On-Line Approximate String Searching Algorithms: Survey and Experimental Results

نویسندگان

  • Panagiotis D. Michailidis
  • Konstantinos G. Margaritis
چکیده

The problem of approximate string searching comprises two classes of problems: string searching with k mismatches and string searching with k differences. In this paper we present a short survey and experimental results for well known sequential approximate string searching algorithms. We consider algorithms based on different approaches including dynamic programming, deterministic finite automata, filtering, counting and bit parallelism. We compare these algorithms in terms of running time against pattern length and for several values of k for four different kinds of text: binary alphabet, alphabet of size 8, English alphabet and DNA alphabet. Finally, we compare the experimental results of the algorithms with their theoretical complexities.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On-line string matching algorithms: survey and experimental results

In this paper we present a short survey and experimental results for well known sequential string matching algorithms. We consider algorithms based on different approaches including classical, suffix automata, bit-parallelism and hashing. We put special emphasis on algorithms recently presented such as Shift-Or and BNDM algorithms. We compare these algorithms in terms of the number of character...

متن کامل

Practical Methods for Approximate String Matching

Given a pattern string and a text, the task of approximate string matching is to find all locations in the text that are similar to the pattern. This type of search may be done for example in applications of spelling error correction or bioinformatics. Typically edit distance is used as the measure of similarity (or distance) between two strings. In this thesis we concentrate on unit-cost edit ...

متن کامل

Efficient generation of super condensed neighborhoods

Indexing methods for the approximate string matching problem spend a considerable effort generating condensed neighborhoods. Condensed neighborhoods, however, are not a minimal representation of a pattern neighborhood. Super condensed neighborhoods, proposed in this work, are smaller, provably minimal and can be used to locate approximate matches that can later be extended by on-line search. We...

متن کامل

On-Line Approximate String Matching with Bounded Errors

We introduce a new dimension to the widely studied on-line approximate string matching problem, by introducing an error threshold parameter so that the algorithm is allowed to miss occurrences with probability . This is particularly appropriate for this problem, as approximate searching is used to model many cases where exact answers are not mandatory. We show that the relaxed version of the pr...

متن کامل

On-line Approximate String Matching in Natural Language

We consider approximate pattern matching in natural language text. We use the words of the text as the alphabet, instead of the characters as in traditional string matching approaches. Hence our pattern consists of a sequence of words. From the algorithmic point of view this has several advantages: (i) the number of words is much less than the number of characters, which in effect means shorter...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Int. J. Comput. Math.

دوره 79  شماره 

صفحات  -

تاریخ انتشار 2002